Exome sequencing identifies novel genetic variants associated with varicose veins

Background Varicose veins (VV) are one of the common human diseases, but the role of genetics in its development is not fully understood. Methods We conducted an exome-wide association study of VV using whole-exome sequencing data from the UK Biobank, and focused on common and rare variants using single-variant association analysis and gene-level collapsing analysis. Findings A total of 13,823,269 autosomal genetic variants were obtained after quality control. We identified 36 VV-related independent common variants mapping to 34 genes by single-variant analysis and three rare variant genes (PIEZO1, ECE1, FBLN7) by collapsing analysis, and most associations between genes and VV were replicated in FinnGen. PIEZO1 was the closest gene associated with VV (P = 5.05 × 10−31), and it was found to reach exome-wide significance in both single-variant and collapsing analyses. Two novel rare variant genes (ECE1 and METTL21A) associated with VV were identified, of which METTL21A was associated only with females. The pleiotropic effects of VV-related genes suggested that body size, inflammation, and pulmonary function are strongly associated with the development of VV. Conclusions Our findings highlight the importance of causal genes for VV and provide new directions for treatment.


Introduction
Varicose veins (VV) are the most common chronic condition of the venous system, often affecting the lower extremities and manifesting as dilated, stretched, or tortuous superficial veins [1].The majority of VV patients suffer from complications such as pain, swelling, hyperpigmentation, and ulcers.About 23% of adults aged 40-80 in the United States develop VV, including 22 million women and 11 million men [2].Today, endovenous laser ablation is considered clinically the first-line treatment option for VV but has a post-operative recurrence rate of up to 20% [3,4].Therefore, it is particularly important to define the etiology of VV.Previous epidemiological studies have shown that VV are associated with several risk factors, including advanced age, being female, pregnancy, obesity, height, and history of deep vein thrombosis [5][6][7][8].A positive family history is one of the important risk factors for VV, which suggests that VV are likely to be modulated by genetic factors [9][10][11][12].Most of the previous genetic studies on VV were genome-wide association studies (GWAS) [13][14][15].Within the last decade, dozens of genomic loci associated with VV have been identified.However, GWAS are more limited in scope and mostly focused on common variants (minor allele frequency (MAF > 1%)), which usually have a small effect size and cannot directly identify the causal gene.
In contrast to traditional studies, large-scale whole-exome sequencing (WES) analysis is more capable of identifying rare genetic variants (MAF < 1%) in diseases [16].Rare variants are genetic markers of high disease risk and help to identify novel genetic targets for drug interventions [17].Apart from revealing the full spectrum of protein-coding variants, WES can facilitate the identification of novel loss-of-function (LOF) variants and enhance the ability to detect the associations between LOF variants and diseases.LOF variants can identify drivers of genetic risk, novel disease genes, and therapeutic targets [16].Also, large-scale sequencing can evaluate the disease prevalence and carrier frequencies of rare genetic variants [18].A recent study focused on the effect of PIEZO1 on altered VV risk and identified the association of VV with rare protein-truncating variants in PIEZO1 [19].A multi-phenotype exome-wide association study (ExWAS) from the UK Biobank (UKB) involving 49,960 participants [16] identified novel LOF variants with significant disease impact, including PIEZO1 on VV (P = 3.2 × 10 −8 ).In addition, a phenome-wide level ExWAS [20] exploring the contribution of rare variants to human disease also found the association between PIEZO1 and VV (P = 3.24 × 10 −24 ).In contrast to their studies, we conducted a more comprehensive ExWAS of VV using the updated exome sequencing data of more than 350,000 UKB individuals, with additional validation and exploration of the findings and biological functions.We identified 19 known and 17 novel causal genes significantly associated with VV, with most of these associations validated in FinnGen [21].In addition, our study explored the phenotypic associations between VV genes and multiple traits, showing that cardiovascular disease, height, biochemistry, and inflammatory indicators might contribute to the development of VV.

Participant characteristics
In this study, we performed an ExWAS by utilizing phenotypic and genetic data from UKB, including exome sequencing data and VV diagnosis data (S1 Table ).Following stringent quality control (QC) of genotypes and samples, we identified 350,770 unrelated Caucasian participants (21,

Exome-wide single-variant association analysis for VV
The mixed-effects logistic models adjusted for age, sex, and the top 10 principal components (PCs) were used to assess the association between VV and common coding variants.Firstly, 169 variants and 114 variants were found to reach the exome-wide significance (P < 1.19 × 10 −6 ) and the genome-wide significance (P < 5 × 10 −8 ) in the single-variant analysis (Fig 1A and S3 Table ).After clumping 169 variants in strong linkage disequilibrium, 36 independent common variants associated with VV were identified (Fig 1A), of which 34 were successfully validated in FinnGen [21] (Table 1 and S4 Table).Among these independent common variants, 19 were significantly protective against VV, while the other 17 were significantly associated with an elevated risk of VV (Table 1).In addition, 36 independent common variants were mapped to 34 genes, of which 18 have been reported previously and the other 16 genes were novel (including TRIM10, UBE2H, TUBAL3, DUSP8, DNAH10, CAPRIN2, MSL1, ZBTB4, CIB3, SERPIND1, SHANK3, REST, CTXN3, H2AC6, PGBD1, ABHD16A).PNO1 and PIEZO1 had the most significant associations with VV, and the P-values were 9.81 × 10 −44 and 1.75 × 10 −39 , respectively.TRIM10 showed the strongest association with VV among 16 novel common variant genes (P = 5.29 × 10 −14 ).The quantile-quantile (Q-Q) plot of single common variants is shown in Fig 1B .It was noteworthy that these independent common variants were characterized by larger effect sizes at lower frequencies (Fig 1C).Furthermore, we conducted a conditional analysis of the identified common variants linked to VV using conditional and joint analysis (COJO).The results revealed that 27 independent common variants were retained, 25 of which were consistent with variants identified after clumping (S5 Table ).
Considering the high prevalence and incidence of VV in the general population, we calculated the carrier frequencies and disease prevalence rates of the putative pathogenic variants.Genes highlighted in bold are not previously reported.rsID was obtained from FinnGen.The "/" represents the variants not tested in FinnGen.Abbreviations: VV, varicose veins; CHR, chromosome; POS, position; OR, odds ratio; CI, confidence interval.

Burden heritability of VV
Then we calculated the gene-wise burden heritability of VV using the method developed by Weiner et al. [22].In this analysis, ultra-rare variants were defined as MAF < 1 × 10 −5 and rare variants were defined as having a MAF between 1 × 10 −5 and 1 × 10 −2 .The burden heritability of rare variants focuses on variants that predict the most serious functional consequences.Among these, ultra-rare coding LOF variants explained 0.287% (se = 0.056%) of phenotypic variance, more than rare LOF variants explained.But rare missense variants accounted for more burden heritability than ultra-rare (rare, 0.053%; ultra-rare, -7.63

Robustness of the association of rare variant genes with VV
We performed leave-one-variant-out (LOVO) analyses to assess the robustness of associations identified in gene-based collapsing analysis and to find the variants that affect each VV-related rare variant gene (S11 Table ).After the removal of the variant chr1:21227980:A:G, the association between ECE1 and VV became non-significant (P = 3.81 × 10 −6 ).Similarly, the highest LOVO P value attained for the association was P = 9.90 × 10 −3 for FBLN7 and VV after removing chr2:112165024:C:G.Importantly, the association of VV and PIEZO1 remained robust in the LOVO analysis (S5 Fig) .This suggested that the significant associations of VV with ECE1 and FBLN7 were strongly influenced by single rare variants, whereas the association with PIEZO1 were more influenced by a combined effect of multiple rare variants.We further assessed whether the significant rare variants were independent of nearby common variants by a conditional analysis.The results showed that the nearby common variants did not affect the effect size and significance of these three rare variant genes (S12 Table ).

Sensitivity analysis of the effects of sex on VV
To explore whether the effects of genes on VV differ by sex, we re-performed single-variant analysis and collapsing analysis after stratifying participants by sex.These analyses included 162,210 males and 188,560 females.After clumping, we identified 5 and 21 independent common variants significantly related to VV in males and females, respectively, with four variants mapping to novel genes only in females (ARHGEF26, BTN1A1, H1-5, TRIM27, S13 Table ).

Enrichment analysis of VV genes and their expression
We conducted a pathway enrichment analysis using the Gene Ontology (GO) Consortium [23,24] for all 36 genes identified in the study as being significantly associated with VV.There were 12 GO terms significantly enriched for these genes (false discovery rate (FDR) correction, corrected α = 0.05).It was found that several VV genes (including UBE2H, TRIM31, and RNF5) were significantly enriched in protein K48-linked ubiquitination (P = 7.20 × 10 −3 ) and 30 genes were enriched in protein binding (P = 2.79 × 10 −2 ) (Fig 3A and S15 Table ).In females, GO enrichment analysis showed that the identified VV-related genes were significantly enriched in protein ubiquitination (P = 1.47 × 10 −3 ) and most genes were enriched in protein binding (P = 1.48 × 10 −2 ) (Fig 3B and S16 Table ).However, the VV genes identified in males were not successfully enriched for the same biological processes.
In addition, we explored the expression of VV-related independent common and rare variant genes in 30 general tissues in FUMA GTEx.We discovered that several genes were highly expressed in adipose tissue, including PIEZO1 and the novel identified ECE1 (Fig 3C and S17 Table ).Next, single-cell expression data from human adipose tissue showed that PIEZO1 and ECE1 were most strongly expressed in the endothelial cell subpopulation, and higher expression of ECE1 was also detected in fibroblasts and neutrophils (Fig 3D , 3E and 3F).
In addition, we tried to find genes with similar continuous trait fingerprints to these rare variant genes on the Gene-SCOUT website [25].SCUBE3 and IGF2BP2 were found to be the most similar to the PIEZO1, respectively (Fig 6A).The module of gene signatures showed that both these similar genes and PIEZO1 were negatively associated with height related traits (Fig 6B).Recent evidence using machine learning methods [15] has identified that height is a risk factor for VV.Our findings further validated the effect of height on the risk of developing VV.

Discussion
Previous GWAS [14,15,26] have identified multiple risk loci and pathways involved in the pathophysiology of VV, improving the understanding of the polygenic architecture of the disease.The WES data from the UKB could provide a new direction for exploring the genetic mechanisms of VV [27].VV-related genes and genetic variants identified in population-based WES studies were more abundant and reliable.It is perhaps that whole-exome data are usually obtained directly from sequencing rather than by imputation.WES typically focuses on coding regions of the genome and is more cost-effective and widely available than whole-genome sequencing [28].Compared to non-coding variants, WES can directly assess variants that alter protein sequence, making it easier to explain their functional consequences.It also provides a clearer pathway to deeper mechanistic insights, and can potentially be useful for therapeutic target discovery [29][30][31][32] and precision medicine [33,34].
Recently, several studies of human gene-phenotype association for rare coding variants have identified the association between PIEZO1 and VV by using exome sequencing data from UKB participants [16,20].Compared to the studies by Van Hout et al. [16] and Wang et al. [20], our study has a larger sample size and more validation of the findings (replication in the Finn-Gen cohort) and exploration of biological function (e.g.functional enrichment analysis, tissue expression analysis, and single-cell RNA sequencing analysis).In addition to the replication of 19 known VV-related genes from previous studies, our study further identified 17 novel genes.Most of these genes were successfully validated in FinnGen.Tissue analysis and single-cell gene expression analysis emphasized the importance of adipose tissue in the association between these identified genes and VV risk.Our Phe-WAS revealed the strong associations of these genes with body size measures, biological indicators, and inflammatory markers.
We identified a total of 36 VV-associated genes in our large-scale exome sequencing study.First, the exome-wide single-variant analysis identified 36 VV-related independent common variants mapping to 34 genes, half of which have been previously established [14,15,26,35,36].The significant effects found in the single-variant analysis were the effect of PNO1 on chromosome 2 and that of PIEZO1 on chromosome 16.PNO1 plays an important role in both proteasome and ribosome biogenesis [37], and PIEZO1 is required for vascular development and function [38].PNO1 is predominantly expressed in the liver and spleen, and slightly in the thymus, testis, and ovary, but not in the heart or brain [39].In addition, it has been suggested that PNO1 may be involved in the expression of calcineurin phosphatase, a key signaling component in angiogenesis, and calcineurin phosphatase activity is essential for vascular development [15,40].As the most significant novel gene identified in the single-variant analysis, TRIM10 was thought to be potentially involved in the body's immune response [41].Phenotypic association analysis revealed that TRIM10 had associations with several inflammatory markers, such as platelets, monocytes, and neutrophils.As a protein-coding gene, UBE2H encodes an E2 ubiquitin-conjugating enzyme family protein, which is involved not only in ubiquitination but also in cytoskeletal regulation and calcium signaling [42].Our study also indicated that UBE2H is involved in protein binding and protein K48-linked ubiquitination by functional enrichment of the gene.
The other findings in this study were the associations of VV with three rare variant genes (PIEZO1, ECE1, and FBLN7) identified by the gene-based collapsing analysis.Both single-variant analysis and gene-based collapsing analysis revealed an association between VV and PIEZO1, supporting the previously reported hypothesis that common and rare variants may involve overlapping genes [43,44].A study by Backman et al. [43] found that missense variants in PIEZO1 might have a gain-of-function effect and reduce VV risk, which offers the possibility of treating VV by pharmacological interventions.PIEZO1 is a mechanically activated ion channel that plays a decisive role in vascular architecture [38,45].Its integral or endothelialspecific disruption can severely damage the growing vascular system and even lead to embryonic death.
As a newly discovered VV-related gene, ECE1 is an estrogen-regulatory gene in mesenteric arteries that could catalyze the conversion of pro-endothelin into endothelin-1, a potent vasoconstrictor [46,47].A mouse model showed that estrogen receptor stimulation suppressed ECE1 expression [48].In addition, previous studies have found an increase in ECE1 in pathological conditions, for instance, hypertension, coronary atherosclerosis, and vascular injury models [49][50][51][52].Our expression analysis of ECE1 revealed its high expression in adipose tissue, especially in endothelial cells, and a positive association between obesity and VV risk has been demonstrated before [15,26].It was well known that the development of VV was associated with a variety of vascular factors, such as changes in hemodynamics, endothelial cell activation, inflammation, and hypoxia [53][54][55].As a cell adhesion molecule associated with diseases and developmental abnormalities, FBLN7 plays an essential role in specialized tissues such as the placenta, blood vessels, and cartilage [56][57][58].FBLN7 protein fragments have been found to regulate endothelial cells and leukocyte function [59,60].With anti-angiogenic function [57], they can be used to treat inflammatory diseases by modulating immune cell activity [61].
Moreover, although the carrier frequencies of LOF variants in the population were low, LOF mutations confer greater disease prevalence and genetic susceptibility to VV.As for the burden heritability, ultra-rare LOF variants had the highest heritability, followed by rare LOF variants and rare missense variants, which was in line with the previous finding by Weiner et al. [22] that LOF variants explained the majority of burden heritability.
We evaluated the robustness of the three genes identified in the collapsing analysis.The rare variant association results remained stable after adjusting for nearby common variants, probably due to our strict variant filter.Our sensitivity analysis revealed a novel gene in females (METTL21A), which prompted us to speculate whether the association between VV and sex might be influenced by genetic factors.Moreover, the current study revealed an overlap between VV-related genes identified by single-variant analyses and those identified by burden tests, suggesting the robustness of the association between overlapping genes and VV risk.A previous genetic study showed that vascular diseases were tightly linked to anthropometric indices [36].The Phe-WAS of the identified genes showed VV genes were also associated with VV-related phenotypes, such as cardiovascular disease, body size measures, and pulmonary function indices.Phe-WAS can further explore genetic contributions to disease and identify the mechanisms of sharing across diseases [62].
There were some limitations in our study.First, the UKB was generally recruited from populations of European ancestry between the ages of 37 and 73, which makes our findings potentially inapplicable across all ethnic and age groups.Second, disease diagnoses are typically confirmed based on participant self-reports, ICD diagnosis codes, surgical records, and death registries.This may lead to the misclassification of disease phenotypes.Nevertheless, previous studies have been able to replicate genetic loci findings for common variants using the phenotype definitions in GWAS [14,15,26].Third, we focused on the coding regions of the genome captured by WES, which may lead to the neglect of variants in non-coding regulatory regions.Due to the difficulty in obtaining exome sequencing data for VV from other databases, we validated our findings using summary statistics from FinnGen.The strength of our study is that it has the largest sample size of any VV-related whole-exome association study to date.Exomewide analyses allow for more efficient targeting of variant genes that play an essential role in complex diseases compared to GWAS [63].We defined the VV phenotype mainly based on the study of Ahmed et al. [14], which successfully replicated most of the VV-related risk loci identified in previous studies using this definition [15,64].We additionally validated the results in FinnGen to determine the reproducibility of our findings.
In conclusion, large-scale exome-wide association studies are required for exploring the genetic associations of diseases.In this study, we identified many known genes and several novel effector genes significantly associated with VV.We found that LOF variants explained more burden heritability than missense variants.Furthermore, investigating genotype-phenotype associations would be critical for our insight into the underlying biological mechanisms of the disease, as well as for disease prevention and treatment.The study of genetic variants and diseases helps to develop strategies to target risk genes for prevention and treatment purposes.

Ethics statement
The UKB has received ethical approval from the National Health Service National Research Ethics Service and obtained written informed consent from all participants.This study was conducted under application number 19542.

Study population
The UKB is a large-scale prospective cohort of nearly 500,000 individuals aged 37-73 recruited between 2006 and 2010 at 22 centers in England, Scotland, and Wales [65].Each participant provided biological samples, phenotypic endpoints, registry information on cancer and death, and health-related information, including biometric measures, lifestyle indicators, biomarkers in blood and urine, and imaging data [66].
In the discovery cohort, the VV cases were identified from the UKB using the following diagnostic, operative, and self-report codes (S1 Table ): primary and/or secondary ICD-10 codes for varicose veins (I83), primary and/or secondary OPCS code for varicose vein surgery (L84-L88), self-reported operation code for varicose vein surgery (1479), self-reported noncancer illness code for varicose veins (1494).Finally, 30,130 individuals who held at least one of these codes were classified as cases of VV.

Sequencing and quality control
The Regeneron Genetics Center (RGC) performed whole-exome sequencing on all participants from the UKB.The complete sequencing protocols were described in detail in the previous study [16].Samples were subjected to double exponential 75 × 75bp paired-end reads using the IDT xGen Exome Research Panel v1.0 on the NovaSeq 6000 platform to capture exomes with over 20X coverage at 95.2% of sites.We performed similar additional QC on the OQFE WES pVCF files in the GRCh38 human reference genome construct [67] provided by the UKB (https://biobank.ctsu.ox.ac.uk/showcase/label.cgi?id=170), which was similar to that in a previous study [68].Due to the limited quality control (QC) and filtering of the published dataset, we used a high-quality dataset generated from an extensive genotype, variant, and sample-level pipeline for further analysis (S1 Text).We utilized Hail to perform the genotype quality control and split multi-allelic sites into bi-allelic sites.All calls with low genotype quality or critically low/high genotype depth were set to no calls.And then we excluded variants based on the call rate < 90%, Hardy-Weinberg P-value < 1 × 10 −15 , and in Ensembl low-complexity regions.We selected a set of high-quality independent autosomal variants for the relationship inference, which was selected as: MAF > 0.1%, missingness < 1%, Hardy-Weinberg equilibrium (HWE) P-value > 1 × 10 −6 and two rounds of pruning using-indep-pairwise 200 100 0.1 and-indep-pairwise 200 100 0.05.KING [69] software was applied to calculate pairwise heterozygote concordance rates for each pair of samples and the kinship coefficient using high-quality variants.Sample-level QC involved removing samples with Ti/Tv, Het/Hom, SNV/indel, number of singletons exceeding the mean ± 8SD, and discrepancies between selfreported and genetically inferred sex, as well as duplicates and withdrawn consent.The kinship coefficient threshold of 0.0884 (the 2nd relatedness) was used to define a related sample.For pairs with kinship coefficients higher than 0.0884, we removed samples associated with multiple other individuals iteratively until none remained to maximize the sample size.Then we randomly removed one from the remaining pairs.Finally, we restricted the sample to Caucasians (field 22006) and used high-quality independent autosomal variants to calculate the PCs within the descent.

Variant annotation
The preliminary analysis was performed by using SnpEff [70] to determine the most severe predicted consequences for specific variants in each gene transcript, with variants filtered to rare.We classified the variants annotated as stop gained, start/stop lost, splice donor/acceptor, or frameshift as LOF variants.A missense variant was considered damaging when it was predicted to be deleterious by SIFT (sorting intolerant from tolerant) [71], PolyPhen-2 HDIV, PolyPhen-2 HVAR [72], LRT (Likelihood Ratio Test) [73], and MutationTaster [74,75] consistently.The definitions of deleterious variants for each software are given in S1 Text.

Single-variant association analysis
Single-variant association analysis of common exonic single nucleotide polymorphisms (SNPs) in VV was performed using SAIGE [76] software.The analysis was adjusted for age, sex, and the top 10 PCs, and the significance threshold was set at 1.19 × 10 −6 (Bonferroni correction for 42,123 exonic SNPs).Then, we clumped the results to remove possible associations due to strong linkage disequilibrium (LD) using the-clump function in PLINK v2.Significant SNPs within each risk locus were selected to identify the lead SNPs, which were defined as those with the lowest P value when a strong LD (r 2 > 0.01) was observed for multiple SNPs within a 1000 kilobases (kb) window.Risk loci were identified as regions of ±1 Mb around lead SNPs.

Gene-based exome-wide association analysis
We conducted gene-based collapsing analysis to define rare variant genes associated with VV.Rare variants were defined as MAF < 1% [77].In our preliminary analysis, carriers of rare LOF variants and missense variants were decomposed into individual variables for each gene.We performed two functional category masks: LOF and LOF + missense variants, and subdivided each functional mask into four frequency masks: < 0.01, < 0.001, < 0.0001, and < 0.00001.It means that we constructed eight gene-burden models for each gene.Following adjustment for age, sex, and the top 10 PCs, gene-based collapsing analysis was conducted in SAIGE-GENE+ software [78] using a logistic mixed model approach with a saddle-point approximation.This approach is effective for the analysis of large sample data, and provides accurate P-values even when the case-control ratio is highly unbalanced [76].In our collapsing analysis, we mainly reported the P value of SKAT-O test [79], and the exome-wide significance threshold (P = 2.51 × 10 −6 ) was determined by employing the Bonferroni correction (0.05/ 19,897).After identifying the VV-related genes, we further calculated the disease prevalence and carrier frequencies of rare variants for these genes.

Burden heritability estimation
In this study, we estimated the heritability explained by the rare coding variants using burden heritability regression (BHR) [22].BHR estimates burden heritability from the regression slope of the gene burden statistic on the burden score.We focused on estimating burden heritability for two functional categories (LOF and missense variants) and two allele frequency bins (MAF < 1 × 10 −5 , and MAF between 1 × 10 −5 and 1 × 10 −2 ).Ultra-rare coding variants were defined as MAF < 1 × 10 −5 in this study.A single-variant association test was performed on the VV to obtain summary statistics of variant-level association.The "aggregate" mode was used to calculate the total burden heritability of disease across multiple frequency-function strata.Single-variant association summary statistics were used as input utilizing the baseline-BHR file provided by Weiner et al. [22].These analyses were performed based on the R package bhr (v0.1.0).

Replication of genes in the FinnGen cohort
We replicated the VV-related genes that we have identified using summary statistics from the FinnGen Consortium online results (version 8, 21).The FinnGen cohort recruited 342,499 individuals of Finnish descent with genotype and health register data, including 26,720 VV cases and 295,014 controls.The cohort collected and confirmed participants' diagnostic information through national healthcare registries and used the International Classification of Diseases (ICD) for documentation.Genetic association tests were performed using logistic mixed models in SAIGE, and GWAS summary statistics for "Varicose veins" are publicly available online (see Data availability).To validate the findings of single-variant association analysis, we directly searched for our lead SNPs in summary statistics.For VV-related genes identified in gene-based collapsing analysis, we selected the variants with the strongest FinnGen VV correlation mapped to these genes.The significance threshold used for validation in the summary statistics was P < 0.05.

Gene set enrichment and expression analyses
To identify terms and metabolic pathways with functional enrichment in differentially expressed genes, we performed the GO enrichment analysis using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [80].The enrichment significance threshold was set at 0.05 (correction method = "FDR").Normalized gene expression (transcripts per million, TPM) was obtained from the Genotype-Tissue Expression (GTEx) v8 for 30 general tissue types using the GENE2FUNC core program in Functional Mapping and Annotation of Genome-Wide Association Studies (FUMA GWAS) [81].We used the average log 2 (TPM) to compare expression levels between genes and tissue types.Data from each gene set was enriched for testing and multiple corrections (default for Benjamini-Hochberg).We leveraged the adipose tissue single-cell RNA sequencing data provided by the Tabula Sapiens Consortium to identify the expression levels of the genes in different cell types [82].The Tabula Sapiens is a single-cell transcriptomic atlas of 24 human tissues and organs from 15 unique donors.The adipose tissues from two donors were obtained from surgery and immediately prepared from a fluorescence-activated cell sorting-sorted smart-seq2 platform or 10x Genomics 3' V3.1 droplet-based sequencing platform.The quality control, data integration, clustering, and cell type annotation procedures of single-cell RNA sequencing data were conducted, as described in the previous study [82].The R package Seurat was used for downstream data analysis (i.e., normalization and data scaling) and visualization [83].

Sensitivity and leave-one-variant-out analyses
As being female is one of the risk factors for VV, sensitivity analyses stratified by sex were performed to explore the sex heterogeneity of the genetic associations identified.There were 162,210 participants in the male group and 188,560 in the female group after stratification by sex.Enrichment analysis was then performed separately for the VV-related genes identified by sex.In addition, we assessed the robustness of the genetic results in the variant masks using LOVO analysis.After removing each variant from the significant models one at a time, we reran the association analysis for the variants included in the original mask.The variant that reached the maximum LOVO P value after removal was regarded as the most important variant in each gene [68].

Conditional analyses
We conducted multi-SNP-based conditional and joint association analyses (GCTA-COJO) using GCTA software [84,85] to identify conditionally independent variants from common variants associated with VV and to validate the results after clumping.The significance threshold was set at P = 1.19 × 10 −6 , and other parameters were left at their default settings.
We performed the conditional analysis to demonstrate that the significant rare variant signals were independent of nearby common variants.First, we ran association analyses for common variants (MAF > 0.5%) within the ±500 kb genomic region of the identified genes.Then we used the-clump function (-clump-p1 1 × 10 −5 ,-clump-r 2 0.01) in PLINK [86] to aggregate and threshold the results, identifying independent index common variants.Finally, gene-based collapsing analysis was rerun after adding the clumped common variant to the model as a covariate [68].

Phenome-wide association analyses
To better explain the mechanisms of associations between risk genes and VV, we performed Phe-WAS.This analysis mainly focused on 53 phenotypes selected from UKB that may be associated with VV, including cardiovascular disease, 29 biochemical traits, 4 body size traits, 9 inflammatory traits, and 10 spirometry traits.Associations of 53 phenotypes with rare variant genes associated with VV were conducted using gene-level collapsing analysis (P < 0.05/159), whereas linear or logistic models were applied to independent common variants (P < 0.05/ 1,802).All models were adjusted for age, sex, and the top 10 PCs.To correct for multiple testing, the Bonferroni test was utilized.

Gene-SCOUT
The Gene-SCOUT (Gene-Similarity from Continuous Traits, https://astrazeneca-cgrpublications.github.io/gene-scout/)[25] aims to find genes similar to specific genes and to construct a unique signature for each gene using associations derived from 450,000 exomes and 120,000 samples of metabolomic data sequenced from the UKB.This tool was used to identify the genes most similar to VV-related genes and to show significant associations of these genes with all quantitative traits.

Fig 1 .
Fig 1. Summary of single-variant associations with VV among the unrelated Caucasians.(A) Manhattan plot of single-variant associations for common variants and VV.The red solid line indicates the significance threshold (P = 1.19 × 10 −6 ), and the red dashed line indicates the genomewide significance threshold (P = 5.0 × 10 −8 ).The gene name corresponds to the gene closest to the variant.Black fonts indicate previously identified genes, while red fonts indicate newly identified genes.(B) The quantile-quantile (Q-Q) plot for single-variant analysis.(C) The plot of effect size (odds ratio) versus effect allele frequency of identified independent common variants.Abbreviation: OR, odds ratio.https://doi.org/10.1371/journal.pgen.1011339.g001 https://doi.org/10.1371/journal.pgen.1011339.t001PIEZO1 had the highest carrier frequency of missense variants (3.77%, the probability of being loss-of-function intolerant (pLI) = 0.54), followed by FBLN7 (1.47%, pLI = 0) and ECE1 (0.20%, pLI = 1) (Fig 2B and S8 and S9 Tables).The high pLI of ECE1 underscores its pronounced intolerance to LOF variants, consistent with our findings of lower carrier frequency (S4 Fig).These variant carriers showed generally modest disease prevalence rates (< 15%), and the disease prevalence rates of missense variants were lower than those of LOF variants in carriers (Fig 2C).

Fig 2 .
Fig 2. Summary of rare genetic associations with VV among the unrelated Caucasians.(A) Manhattan plot of the gene-based collapsing analysis.The red dashed line indicates the significance threshold (P = 2.51 × 10 −6 ).We only showed the results of the collapsing test with four models including two MAF cutoffs (0.01 and 0.001) and two variant annotation groups (LOF and LOF + missense), and the other results are shown in S6 Table.All models were adjusted for age, sex, and the top 10 PCs.(B) Bar chart showing carrier frequencies for rare LOF and missense variants for three identified rare variant genes.(C) Disease prevalence of LOF and missense variants in VV-related rare variant genes.(D) Heritability of the burden of genetic variants in VV.Abbreviation: LOF, loss-of-function.https://doi.org/10.1371/journal.pgen.1011339.g002

Fig 3 .Fig 4 .
Fig 3. Enrichment and expression analysis of significant effect associations in VV. (A) Results of the functional enrichment in biological pathway databases (P < 0.05).(B) Results of the functional enrichment in biological pathway databases in females.Only the top 12 significant pathways (P < 0.05) are shown, and the complete results are shown in S16 Table.(C) The heat map shows the tissue enrichment results of each VV-related gene.The H2AC6 gene was not recognized in FUMA Ensembl ID. (D) Single-cell RNA sequencing analysis results of adipose tissue.(E, F) Cell-specific expression of PIEZO1 and ECE1 in adipose tissue.Abbreviation: BP, biological process; CC, cellular component; MF, molecular function; TPM, transcripts per million; Umap, Uniform manifold approximation and projection.https://doi.org/10.1371/journal.pgen.1011339.g003

Fig 6 .
Fig 6.Quantitative traits and disease characteristics of PIEZO1.(A) Genes with the most similar quantitative trait characteristics to PIEZO1 in UK Biobank from Gene-SCOUT.(B) Signatures of PIEZO1 and its similar genes (similarity to PIEZO1 sorted from top to bottom).https://doi.org/10.1371/journal.pgen.1011339.g006 643 VV cases and 329,127 controls) with a mean age of 56.94 years at enrollment, of which 53.76% were female (S2 Table).The case and control groups had similar age distribution (S1 Fig).The clustering of the VV case and control groups in the principal component analysis is shown in S2 Fig.A total of 13,823,269 autosomal genetic variants were obtained after QC, including 100,098 common variants (MAF > 1%) and 13,723,171 rare variants (MAF < 1%).